Goto

Collaborating Authors

 semi-discrete optimal transport problem


A Combinatorial Algorithm for the Semi-Discrete Optimal Transport Problem

Neural Information Processing Systems

Optimal Transport (OT, also known as the Wasserstein distance) is a popular metric for comparing probability distributions and has been successfully used in many machine-learning applications.In the semi-discrete $2$-Wasserstein problem, we wish to compute the cheapest way to transport all the mass from a continuous distribution $\mu$ to a discrete distribution $\nu$ in $\mathbb{R}^d$ for $d\ge 1$, where the cost of transporting unit mass between points $a$ and $b$ is $d(a,b)=||a-b||^2$. When both distributions are discrete, a simple combinatorial framework has been used to find the exact solution (see e.g.


A Combinatorial Algorithm for the Semi-Discrete Optimal Transport Problem

Neural Information Processing Systems

Optimal Transport (OT, also known as the Wasserstein distance) is a popular metric for comparing probability distributions and has been successfully used in many machine-learning applications.In the semi-discrete 2 -Wasserstein problem, we wish to compute the cheapest way to transport all the mass from a continuous distribution \mu to a discrete distribution u in \mathbb{R} d for d\ge 1, where the cost of transporting unit mass between points a and b is d(a,b) a-b 2 . When both distributions are discrete, a simple combinatorial framework has been used to find the exact solution (see e.g. In this paper, we propose a combinatorial framework for the semi-discrete OT, which can be viewed as an extension of the combinatorial framework for the discrete OT but requires several new ideas. We present a new algorithm that given \mu and u in \mathbb{R} 2 and a parameter \varepsilon 0, computes an \varepsilon -additive approximate semi-discrete transport plan in O(n {4}\log n\log \frac{1}{\varepsilon}) time (in the worst case), where n is the support-size of the discrete distribution u and we assume that the mass of \mu inside a triangle can be computed in O(1) time. Our algorithm is significantly faster than the known algorithms, and unlike many numerical algorithms, it does not make any assumptions on the smoothness of \mu .As an application of our algorithm, we describe a data structure to store a large discrete distribution \mu (with support size N) using O(N) space so that, given a query discrete distribution u (with support size k), an \varepsilon -additive approximate transport plan can be computed in O(k {3}\sqrt{N}\log \frac{1}{\varepsilon}) time in 2 dimensions.


Semi-Discrete Optimal Transport: Hardness, Regularization and Numerical Solution

arXiv.org Machine Learning

Semi-discrete optimal transport problems, which evaluate the Wasserstein distance between a discrete and a generic (possibly non-discrete) probability measure, are believed to be computationally hard. Even though such problems are ubiquitous in statistics, machine learning and computer vision, however, this perception has not yet received a theoretical justification. To fill this gap, we prove that computing the Wasserstein distance between a discrete probability measure supported on two points and the Lebesgue measure on the standard hypercube is already #P-hard. This insight prompts us to seek approximate solutions for semi-discrete optimal transport problems. We thus perturb the underlying transportation cost with an additive disturbance governed by an ambiguous probability distribution, and we introduce a distributionally robust dual optimal transport problem whose objective function is smoothed with the most adverse disturbance distributions from within a given ambiguity set. We further show that smoothing the dual objective function is equivalent to regularizing the primal objective function, and we identify several ambiguity sets that give rise to several known and new regularization schemes. As a byproduct, we discover an intimate relation between semi-discrete optimal transport problems and discrete choice models traditionally studied in psychology and economics. To solve the regularized optimal transport problems efficiently, we use a stochastic gradient descent algorithm with imprecise stochastic gradient oracles. A new convergence analysis reveals that this algorithm improves the best known convergence guarantee for semi-discrete optimal transport problems with entropic regularizers.